Gene prediction by multiple syntenic alignment
نویسندگان
چکیده
Given the increasing number of available genomic sequences, one now faces the task of identifying their functional parts, like the protein coding regions. The gene prediction problem can be addressed in several ways. One of the most promising methods makes use of similarity information between the genomic DNA and previously annotated sequences (proteins, cDNAs and ESTs). Recently, given the huge amount of newly sequenced genomes, new similarity-based methods are being successfully applied in the task of gene prediction. The so-called comparative-based methods lie in the similarities shared by regions of two evolutionary related genomic sequences. Despite the number of different gene prediction approaches in the literature, this problem remains challenging. In this paper we present a new comparative-based approach to the gene prediction problem. It is based on a syntenic alignment of three or more genomic sequences. With syntenic alignment we mean an alignment that is constructed taking into account the fact that the involved sequences include conserved regions intervened by unconserved ones. We have implemented the proposed algorithm in a computer program and confirm the validity of the approach on a benchmark including triples of human, mouse and rat genomic sequences.
منابع مشابه
Gene Prediction by Syntenic Alignment
Abstract. Given the number of available genomic DNA, one now faces the task of identifying the functional parts of such raw sequence data, like the protein-coding regions. The gene prediction problem can be addressed in several ways. The most recently methods make use of the similarities between regions of two unannotated genomic sequences in order to find their genes. In this paper we present ...
متن کاملGenome-Wide Comparative in silico Analysis of Calcium Transporters of Rice and Sorghum
The mechanism of calcium uptake, translocation and accumulation in Poaceae has not yet been fully understood. To address this issue, we conducted genome-wide comparative in silico analysis of the calcium (Ca(2+)) transporter gene family of two crop species, rice and sorghum. Gene annotation, identification of upstream cis-acting elements, phylogenetic tree construction and syntenic mapping of t...
متن کاملTechniques for multi-genome synteny analysis to overcome assembly limitations.
Genome scale synteny analysis, the analysis of relative gene-order conservation between species, can provide key insights into evolutionary chromosomal dynamics, rearrangement rates between species, and speciation analysis. With the rapid availability of multiple genomes, there is a need for efficient solutions to aid in comparative syntenic analysis. Current methods rely on homology assessment...
متن کاملGenomic features in the breakpoint regions between syntenic blocks
MOTIVATION We study the largely unaligned regions between the syntenic blocks conserved in humans and mice, based on data extracted from the UCSC genome browser. These regions contain evolutionary breakpoints caused by inversion, translocation and other processes. RESULTS We suggest explanations for the limited amount of genomic alignment in the neighbourhoods of breakpoints. We discount infe...
متن کاملSLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model.
Comparative-based gene recognition is driven by the principle that conserved regions between related organisms are more likely than divergent regions to be coding. We describe a probabilistic framework for gene structure and alignment that can be used to simultaneously find both the gene structure and alignment of two syntenic genomic regions. A key feature of the method is the ability to enhan...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- J. Integrative Bioinformatics
دوره 2 شماره
صفحات -
تاریخ انتشار 2005